In [2]:
# import * is a personal choice
from ggplot import *
# our trusty old friends
import pandas as pd
import numpy as np

In [1]:
%matplotlib inline

Built-in datasets


In [3]:
meat.head()


Out[3]:
         date  beef  veal  pork  lamb_and_mutton  broilers  other_chicken  turkey
0  1944-01-01   751    85  1280               89       NaN            NaN     NaN
1  1944-02-01   713    77  1169               72       NaN            NaN     NaN
2  1944-03-01   741    90  1128               75       NaN            NaN     NaN
3  1944-04-01   650    89   978               66       NaN            NaN     NaN
4  1944-05-01   681   106  1029               78       NaN            NaN     NaN

5 rows × 8 columns


In [4]:
diamonds.head()


Out[4]:
   carat      cut  color  clarity  depth  table  price     x     y     z
0   0.23    Ideal      E      SI2   61.5     55    326  3.95  3.98  2.43
1   0.21  Premium      E      SI1   59.8     61    326  3.89  3.84  2.31
2   0.23     Good      E      VS1   56.9     65    327  4.05  4.07  2.31
3   0.29  Premium      I      VS2   62.4     58    334  4.20  4.23  2.63
4   0.31     Good      J      SI2   63.3     58    335  4.34  4.35  2.75

5 rows × 10 columns


In [5]:
mtcars.head()


Out[5]:
                name   mpg  cyl  disp   hp  drat     wt   qsec  vs  am  gear  carb
0          Mazda RX4  21.0    6   160  110  3.90  2.620  16.46   0   1     4     4
1      Mazda RX4 Wag  21.0    6   160  110  3.90  2.875  17.02   0   1     4     4
2         Datsun 710  22.8    4   108   93  3.85  2.320  18.61   1   1     4     1
3     Hornet 4 Drive  21.4    6   258  110  3.08  3.215  19.44   1   0     3     1
4  Hornet Sportabout  18.7    8   360  175  3.15  3.440  17.02   0   0     3     2

5 rows × 12 columns


In [7]:
pageviews.head()


Out[7]:
             date_hour    pageviews
0  2013-02-11 21:00:00  8860.982383
1  2013-02-11 22:00:00  8637.474753
2  2013-02-11 23:00:00  9020.593099
3  2013-02-12 00:00:00  8437.500380
4  2013-02-12 01:00:00  9157.399672

The API

ggplot

ggplot's API revolves around the ggplot class. It's a class that behaves much more like a function: you rarely call methods on a ggplot object directly.


In [9]:
?ggplot

A ggplot takes two arguments: a data frame and the accompanying "aesthetics", or aes. The arguments can be given in either order, so the following two calls are equivalent.


In [6]:
p = ggplot(aes(x='wt'), data=mtcars)

In [7]:
p = ggplot(mtcars, aes(x='wt'))
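
Accepting two arguments in either order can be done with a simple type check. The sketch below is only an illustration of that idea, not ggplot's actual source: AesSketch and PlotSketch are hypothetical stand-ins, and a plain dict of lists stands in for the data frame so the example runs without extra packages.

```python
class AesSketch(dict):
    """Hypothetical stand-in for aes: maps an aesthetic to a column name."""


class PlotSketch:
    """Hypothetical stand-in for ggplot that accepts its arguments in either order."""

    def __init__(self, aesthetics, data):
        # If the aesthetics arrived in the second slot, swap the two
        # so that each attribute always holds the right kind of object.
        if not isinstance(aesthetics, AesSketch):
            aesthetics, data = data, aesthetics
        if not isinstance(aesthetics, AesSketch):
            raise TypeError("one of the two arguments must be an AesSketch")
        self.aesthetics = aesthetics
        self.data = data


# A dict of lists stands in for mtcars (first three wt values from the table above).
cars = {"wt": [2.620, 2.875, 2.320]}

a = PlotSketch(AesSketch(x="wt"), data=cars)
b = PlotSketch(cars, AesSketch(x="wt"))

print(a.aesthetics == b.aesthetics)  # True
print(a.data is b.data)              # True
```

Both calls end up with the same aesthetics and the same data, which is the behavior the two ggplot cells above rely on.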

A ggplot is a "base layer". It won't draw any data on its own, so think of it as a blank canvas. Watch what happens when you render it.


In [8]:
p


Out[8]:
<ggplot: (281151189)>

aes

Aesthetics, or aes, define how ggplot will extract data from your data frame and render it. Think of them as the instructions for creating the x, y, color, etc. components.

aes is just a dictionary whose keys are aesthetic properties and whose values are strings or formulas (for more on formulas read this) that refer to data in your data frame.


In [9]:
aes(x='date', y='price')


Out[9]:
{'y': 'price', 'x': 'date'}

In [10]:
# shorthand
aes('date', 'price')


Out[10]:
{u'y': 'price', u'x': 'date'}

In [11]:
# shorthand
aes('date', 'price', 'name')


Out[11]:
{u'y': 'price', u'x': 'date', u'color': 'name'}

In [12]:
# formula
aes(x='date', y='price', color='date * price', shape='factor(name)')


Out[12]:
{'color': 'date * price', 'y': 'price', 'shape': 'factor(name)', 'x': 'date'}
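
The shorthand outputs above suggest that positional arguments are assigned to x, y, and color, in that order. A minimal sketch of that behavior follows, assuming nothing beyond what those outputs show; AesSketch is a hypothetical stand-in, not the library's aes class.

```python
class AesSketch(dict):
    """Hypothetical stand-in for aes, built on a plain dict."""

    # Order in which positional arguments are assigned, inferred from
    # the shorthand outputs shown above (x, then y, then color).
    POSITIONAL = ("x", "y", "color")

    def __init__(self, *args, **kwargs):
        if len(args) > len(self.POSITIONAL):
            raise TypeError(
                "at most %d positional aesthetics are supported" % len(self.POSITIONAL)
            )
        # zip stops at the shorter sequence, so only the supplied
        # positional values receive a key.
        super().__init__(zip(self.POSITIONAL, args))
        # Keyword arguments name their aesthetic explicitly.
        self.update(kwargs)


explicit = AesSketch(x="date", y="price")
shorthand = AesSketch("date", "price")
with_color = AesSketch("date", "price", "name")

print(shorthand == explicit)  # True
print(with_color["color"])    # name
```

Because the result is an ordinary dictionary, it can be inspected, compared, and updated like any other dict.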

Your first ggplot

Putting together everything we've learned so far, let's use the mtcars dataset to plot the relationship between car weight (wt) and miles per gallon (mpg). First, create a ggplot object with the proper aes and name it p.


In [30]:
p = ggplot(aes(x='wt', y='mpg'), data=mtcars)
p


Out[30]:
<ggplot: (274847889)>

Now let's (quite literally) add a scatterplot (geom_point) to our plot. We'll get into more detail on how this works later.


In [31]:
p + geom_point()


Out[31]:
<ggplot: (275439357)>
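
In Python, the + in p + geom_point() is ordinary operator overloading. The sketch below shows one way a plot object could collect layers through __add__. PlotSketch and PointLayerSketch are hypothetical names, and returning a copy rather than modifying the plot in place is a design choice of this sketch, not a statement about how ggplot itself behaves.

```python
import copy


class PointLayerSketch:
    """Hypothetical stand-in for a geom such as geom_point."""

    name = "point"


class PlotSketch:
    """Hypothetical stand-in for a ggplot object: data, aesthetics, layers."""

    def __init__(self, aesthetics, data):
        self.aesthetics = aesthetics
        self.data = data
        self.layers = []  # a fresh plot is an empty canvas

    def __add__(self, layer):
        # Return a shallow copy with one more layer, so the base
        # plot stays untouched and can be reused with other layers.
        new_plot = copy.copy(self)
        new_plot.layers = self.layers + [layer]
        return new_plot


# A dict of lists stands in for mtcars (first two rows of wt and mpg).
cars = {"wt": [2.620, 2.875], "mpg": [21.0, 21.0]}

base = PlotSketch({"x": "wt", "y": "mpg"}, cars)
scatter = base + PointLayerSketch()

print(len(base.layers))     # 0
print(len(scatter.layers))  # 1
```

Keeping the base plot unchanged means the same p can be combined with different layers without one experiment leaking into the next.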
